ANE support #159

eiln · 2023-06-21T05:32:28Z

The driver's pretty simple and hasn't given me issues the past 4 months. The invalidation workaround isn't the prettiest, but I can't think of a better way; I really don't think this single oddball case warrants the dt-level changes needed to handle it in iommu. Tested on J293AP. T6000 ane0/ane2 worked on m1n1.

Change the error type of the constructors of `Arc` and `UniqueArc` to be `AllocError` instead of `Error`. This makes the API more clear as to what can go wrong when calling `try_new` or its variants. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Gary Guo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This function mirrors `UnsafeCell::raw_get`. It avoids creating a reference and allows solely using raw pointers. The `pin-init` API will be using this, since uninitialized memory requires raw pointers. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Gary Guo <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This API is used to facilitate safe pinned initialization of structs. It replaces cumbersome `unsafe` manual initialization with elegant safe macro invocations. Due to the size of this change it has been split into six commits: 1. This commit introducing the basic public interface: traits and functions to represent and create initializers. 2. Adds the `#[pin_data]`, `pin_init!`, `try_pin_init!`, `init!` and `try_init!` macros along with their internal types. 3. Adds the `InPlaceInit` trait that allows using an initializer to create an object inside of a `Box<T>` and other smart pointers. 4. Adds the `PinnedDrop` trait and adds macro support for it in the `#[pin_data]` macro. 5. Adds the `stack_pin_init!` macro allowing to pin-initialize a struct on the stack. 6. Adds the `Zeroable` trait and `init::zeroed` function to initialize types that have `0x00` in all bytes as a valid bit pattern. -- In this section the problem that the new pin-init API solves is outlined. This message describes the entirety of the API, not just the parts introduced in this commit. For a more granular explanation and additional information on pinning and this issue, view [1]. Pinning is Rust's way of enforcing the address stability of a value. When a value gets pinned it will be impossible for safe code to move it to another location. This is done by wrapping pointers to said object with `Pin<P>`. This wrapper prevents safe code from creating mutable references to the object, preventing mutable access, which is needed to move the value. `Pin<P>` provides `unsafe` functions to circumvent this and allow modifications regardless. It is then the programmer's responsibility to uphold the pinning guarantee. Many kernel data structures require a stable address, because there are foreign pointers to them which would get invalidated by moving the structure. Since these data structures are usually embedded in structs to use them, this pinning property propagates to the container struct. Resulting in most structs in both Rust and C code needing to be pinned. So if we want to have a `mutex` field in a Rust struct, this struct also needs to be pinned, because a `mutex` contains a `list_head`. Additionally initializing a `list_head` requires already having the final memory location available, because it is initialized by pointing it to itself. But this presents another challenge in Rust: values have to be initialized at all times. There is the `MaybeUninit<T>` wrapper type, which allows handling uninitialized memory, but this requires using the `unsafe` raw pointers and a casting the type to the initialized variant. This problem gets exacerbated when considering encapsulation and the normal safety requirements of Rust code. The fields of the Rust `Mutex<T>` should not be accessible to normal driver code. After all if anyone can modify the fields, there is no way to ensure the invariants of the `Mutex<T>` are upheld. But if the fields are inaccessible, then initialization of a `Mutex<T>` needs to be somehow achieved via a function or a macro. Because the `Mutex<T>` must be pinned in memory, the function cannot return it by value. It also cannot allocate a `Box` to put the `Mutex<T>` into, because that is an unnecessary allocation and indirection which would hurt performance. The solution in the rust tree (e.g. this commit: [2]) that is replaced by this API is to split this function into two parts: 1. A `new` function that returns a partially initialized `Mutex<T>`, 2. An `init` function that requires the `Mutex<T>` to be pinned and that fully initializes the `Mutex<T>`. Both of these functions have to be marked `unsafe`, since a call to `new` needs to be accompanied with a call to `init`, otherwise using the `Mutex<T>` could result in UB. And because calling `init` twice also is not safe. While `Mutex<T>` initialization cannot fail, other structs might also have to allocate memory, which would result in conditional successful initialization requiring even more manual accommodation work. Combine this with the problem of pin-projections -- the way of accessing fields of a pinned struct -- which also have an `unsafe` API, pinned initialization is riddled with `unsafe` resulting in very poor ergonomics. Not only that, but also having to call two functions possibly multiple lines apart makes it very easy to forget it outright or during refactoring. Here is an example of the current way of initializing a struct with two synchronization primitives (see [3] for the full example): struct SharedState { state_changed: CondVar, inner: Mutex<SharedStateInner>, } impl SharedState { fn try_new() -> Result<Arc<Self>> { let mut state = Pin::from(UniqueArc::try_new(Self { // SAFETY: `condvar_init!` is called below. state_changed: unsafe { CondVar::new() }, // SAFETY: `mutex_init!` is called below. inner: unsafe { Mutex::new(SharedStateInner { token_count: 0 }) }, })?); // SAFETY: `state_changed` is pinned when `state` is. let pinned = unsafe { state.as_mut().map_unchecked_mut(|s| &mut s.state_changed) }; kernel::condvar_init!(pinned, "SharedState::state_changed"); // SAFETY: `inner` is pinned when `state` is. let pinned = unsafe { state.as_mut().map_unchecked_mut(|s| &mut s.inner) }; kernel::mutex_init!(pinned, "SharedState::inner"); Ok(state.into()) } } The pin-init API of this patch solves this issue by providing a comprehensive solution comprised of macros and traits. Here is the example from above using the pin-init API: #[pin_data] struct SharedState { #[pin] state_changed: CondVar, #[pin] inner: Mutex<SharedStateInner>, } impl SharedState { fn new() -> impl PinInit<Self> { pin_init!(Self { state_changed <- new_condvar!("SharedState::state_changed"), inner <- new_mutex!( SharedStateInner { token_count: 0 }, "SharedState::inner", ), }) } } Notably the way the macro is used here requires no `unsafe` and thus comes with the usual Rust promise of safe code not introducing any memory violations. Additionally it is now up to the caller of `new()` to decide the memory location of the `SharedState`. They can choose at the moment `Arc<T>`, `Box<T>` or the stack. -- The API has the following architecture: 1. Initializer traits `PinInit<T, E>` and `Init<T, E>` that act like closures. 2. Macros to create these initializer traits safely. 3. Functions to allow manually writing initializers. The initializers (an `impl PinInit<T, E>`) receive a raw pointer pointing to uninitialized memory and their job is to fully initialize a `T` at that location. If initialization fails, they return an error (`E`) by value. This way of initializing cannot be safely exposed to the user, since it relies upon these properties outside of the control of the trait: - the memory location (slot) needs to be valid memory, - if initialization fails, the slot should not be read from, - the value in the slot should be pinned, so it cannot move and the memory cannot be deallocated until the value is dropped. This is why using an initializer is facilitated by another trait that ensures these requirements. These initializers can be created manually by just supplying a closure that fulfills the same safety requirements as `PinInit<T, E>`. But this is an `unsafe` operation. To allow safe initializer creation, the `pin_init!` is provided along with three other variants: `try_pin_init!`, `try_init!` and `init!`. These take a modified struct initializer as a parameter and generate a closure that initializes the fields in sequence. The macros take great care in upholding the safety requirements: - A shadowed struct type is used as the return type of the closure instead of `()`. This is to prevent early returns, as these would prevent full initialization. - To ensure every field is only initialized once, a normal struct initializer is placed in unreachable code. The type checker will emit errors if a field is missing or specified multiple times. - When initializing a field fails, the whole initializer will fail and automatically drop fields that have been initialized earlier. - Only the correct initializer type is allowed for unpinned fields. You cannot use a `impl PinInit<T, E>` to initialize a structurally not pinned field. To ensure the last point, an additional macro `#[pin_data]` is needed. This macro annotates the struct itself and the user specifies structurally pinned and not pinned fields. Because dropping a pinned struct is also not allowed to break the pinning invariants, another macro attribute `#[pinned_drop]` is needed. This macro is introduced in a following commit. These two macros also have mechanisms to ensure the overall safety of the API. Additionally, they utilize a combined proc-macro, declarative macro design: first a proc-macro enables the outer attribute syntax `#[...]` and does some important pre-parsing. Notably this prepares the generics such that the declarative macro can handle them using token trees. Then the actual parsing of the structure and the emission of code is handled by a declarative macro. For pin-projections the crates `pin-project` [4] and `pin-project-lite` [5] had been considered, but were ultimately rejected: - `pin-project` depends on `syn` [6] which is a very big dependency, around 50k lines of code. - `pin-project-lite` is a more reasonable 5k lines of code, but contains a very complex declarative macro to parse generics. On top of that it would require modification that would need to be maintained independently. Link: https://rust-for-linux.com/the-safe-pinned-initialization-problem [1] Link: https://github.com/Rust-for-Linux/linux/tree/0a04dc4ddd671efb87eef54dde0fb38e9074f4be [2] Link: https://github.com/Rust-for-Linux/linux/blob/f509ede33fc10a07eba3da14aa00302bd4b5dddd/samples/rust/rust_miscdev.rs [3] Link: https://crates.io/crates/pin-project [4] Link: https://crates.io/crates/pin-project-lite [5] Link: https://crates.io/crates/syn [6] Co-developed-by: Gary Guo <[email protected]> Signed-off-by: Gary Guo <[email protected]> Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Wedson Almeida Filho <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

Add the following initializer macros: - `#[pin_data]` to annotate structurally pinned fields of structs, needed for `pin_init!` and `try_pin_init!` to select the correct initializer of fields. - `pin_init!` create a pin-initializer for a struct with the `Infallible` error type. - `try_pin_init!` create a pin-initializer for a struct with a custom error type (`kernel::error::Error` is the default). - `init!` create an in-place-initializer for a struct with the `Infallible` error type. - `try_init!` create an in-place-initializer for a struct with a custom error type (`kernel::error::Error` is the default). Also add their needed internal helper traits and structs. Co-developed-by: Gary Guo <[email protected]> Signed-off-by: Gary Guo <[email protected]> Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Fixed three typos. ] Signed-off-by: Miguel Ojeda <[email protected]>

…ters The `InPlaceInit` trait that provides two functions, for initializing using `PinInit<T, E>` and `Init<T>`. It is implemented by `Arc<T>`, `UniqueArc<T>` and `Box<T>`. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Gary Guo <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

The `PinnedDrop` trait that facilitates destruction of pinned types. It has to be implemented via the `#[pinned_drop]` macro, since the `drop` function should not be called by normal code, only by other destructors. It also only works on structs that are annotated with `#[pin_data(PinnedDrop)]`. Co-developed-by: Gary Guo <[email protected]> Signed-off-by: Gary Guo <[email protected]> Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

The `stack_pin_init!` macro allows pin-initializing a value on the stack. It accepts a `impl PinInit<T, E>` to initialize a `T`. It allows propagating any errors via `?` or handling it normally via `match`. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Reviewed-by: Gary Guo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

Add the `Zeroable` trait which marks types that can be initialized by writing `0x00` to every byte of the type. Also add the `init::zeroed` function that creates an initializer for a `Zeroable` type that writes `0x00` to every byte. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Gary Guo <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

Add `pin-init` API macros and traits to the prelude. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Gary Guo <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This function allows to easily initialize `Opaque` with the pin-init API. `Opaque::ffi_init` takes a closure and returns a pin-initializer. This pin-initiailizer calls the given closure with a pointer to the inner `T`. Co-developed-by: Gary Guo <[email protected]> Signed-off-by: Gary Guo <[email protected]> Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Fixed typo. ] Signed-off-by: Miguel Ojeda <[email protected]>

`UniqueArc::try_new_uninit` calls `Arc::try_new(MaybeUninit::uninit())`. This results in the uninitialized memory being placed on the stack, which may be arbitrarily large due to the generic `T` and thus could cause a stack overflow for large types. Change the implementation to use the pin-init API which enables in-place initialization. In particular it avoids having to first construct and then move the uninitialized memory from the stack into the final location. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Gary Guo <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

Add two functions `init_with` and `pin_init_with` to `UniqueArc<MaybeUninit<T>>` to initialize the memory of already allocated `UniqueArc`s. This is useful when you want to allocate memory check some condition inside of a context where allocation is forbidden and then conditionally initialize an object. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Gary Guo <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Reviewed-by: Andreas Hindborg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This makes it possible to use `T` as a `impl Init<T, E>` for every error type `E` instead of just `Infallible`. Signed-off-by: Benno Lossin <[email protected]> Reviewed-by: Gary Guo <[email protected]> Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

Benno has been involved with the Rust for Linux project for the better part of a year now. He has been working on solving the safe pinned initialization problem [1], which resulted in the pin-init API patch series [2] that allows to reduce the need for `unsafe` code in the kernel. He is also working on the field projection RFC for Rust [3] to bring pin-init as a language feature. His expertise with the language will be very useful to have around in the future if Rust grows within the kernel, thus add him to the `RUST` entry as reviewer. Link: https://rust-for-linux.com/the-safe-pinned-initialization-problem [1] Link: https://lore.kernel.org/rust-for-linux/[email protected]/ [2] Link: rust-lang/rfcs#3318 [3] Cc: Benno Lossin <[email protected]> Reviewed-by: Alex Gaynor <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

It is a wrapper around C's `lock_class_key`, which is used by the synchronisation primitives that are checked with lockdep. This is in preparation for introducing Rust abstractions for these primitives. Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Will Deacon <[email protected]> Cc: Waiman Long <[email protected]> Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Co-developed-by: Boqun Feng <[email protected]> Signed-off-by: Boqun Feng <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Reviewed-by: Gary Guo <[email protected]> Reviewed-by: Benno Lossin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

They are generic Rust implementations of a lock and a lock guard that contain code that is common to all locks. Different backends will be introduced in subsequent commits. Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Suggested-by: Gary Guo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Fixed typo. ] Signed-off-by: Miguel Ojeda <[email protected]>

This is the `struct mutex` lock backend and allows Rust code to use the kernel mutex idiomatically. Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Will Deacon <[email protected]> Cc: Waiman Long <[email protected]> Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This is the `spinlock_t` lock backend and allows Rust code to use the kernel spinlock idiomatically. Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Will Deacon <[email protected]> Cc: Waiman Long <[email protected]> Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This is an owned reference to an object that is always ref-counted. This is meant to be used in wrappers for C types that have their own ref counting functions, for example, tasks, files, inodes, dentries, etc. Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Reviewed-by: Gary Guo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

It is an abstraction for C's `struct task_struct`. It implements `AlwaysRefCounted`, so the refcount of the wrapped object is managed safely on the Rust side. Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This allows Rust code to get a reference to the current task without having to increment the refcount, but still guaranteeing memory safety. Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This allows us to have data protected by a lock despite not being wrapped by it. Access is granted by providing evidence that the lock is held by the caller. Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Reviewed-by: Benno Lossin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

It releases the lock, executes some function provided by the caller, then reacquires the lock. This is preparation for the implementation of condvars, which will sleep after between unlocking and relocking. We need an explicit `relock` method for primitives like `SpinLock` that have an irqsave variant: we use the guard state to determine if the lock was originally acquired with the regular `lock` function or `lock_irqsave`. Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Link: https://lore.kernel.org/rust-for-linux/[email protected]/ [ Removed the irqsave bits as discussed. ] Signed-off-by: Miguel Ojeda <[email protected]>

This is the traditional condition variable or monitor synchronisation primitive. It is implemented with C's `wait_queue_head_t`. It allows users to release a lock and go to sleep while guaranteeing that notifications won't be missed. This is achieved by enqueuing a wait entry before releasing the lock. Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Will Deacon <[email protected]> Cc: Waiman Long <[email protected]> Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Wedson Almeida Filho <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

This crate mirrors the `bindings` crate, but will contain only UAPI bindings. Unlike the bindings crate, drivers may directly use this crate if they have to interface with userspace. Initially, just bind the generic ioctl stuff. In the future, we would also like to add additional checks to ensure that all types exposed by this crate satisfy UAPI-safety guarantees (that is, they are safely castable to/from a "bag of bits"). [ Miguel: added support for the `rustdoc` and `rusttest` targets, since otherwise they fail, and we want to keep them working. ] Reviewed-by: Martin Rodriguez Reboredo <[email protected]> Signed-off-by: Asahi Lina <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Miguel Ojeda <[email protected]>

Add simple 1:1 wrappers of the C ioctl number manipulation functions. Since these are macros we cannot bindgen them directly, and since they should be usable in const context we cannot use helper wrappers, so we'll have to reimplement them in Rust. Thankfully, the C headers do declare defines for the relevant bitfield positions, so we don't need to duplicate that. Signed-off-by: Asahi Lina <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Moved the `#![allow(non_snake_case)]` to the usual place. ] Signed-off-by: Miguel Ojeda <[email protected]>

This commit provides the build flags for Rust for AArch64. The core Rust support already in the kernel does the rest. The Rust samples have been tested with this commit. [jcunliffe: Arm specific parts taken from Miguel's upstream tree] Signed-off-by: Miguel Ojeda <[email protected]> Co-developed-by: Jamie Cunliffe <[email protected]> Signed-off-by: Jamie Cunliffe <[email protected]> Signed-off-by: Vincenzo Palazzo <[email protected]> Reviewed-by: Vincenzo Palazzo <[email protected]> Reviewed-by: Gary Guo <[email protected]>

Enable the PAC ret and BTI options in the Rust build flags to match the options that are used when building C. Signed-off-by: Jamie Cunliffe <[email protected]> Reviewed-by: Vincenzo Palazzo <[email protected]>

Disable the neon and fp target features to avoid fp & simd registers. The use of fp-armv8 will cause a warning from rustc about an unknown feature that is specified. The target feature is still passed through to LLVM, this behaviour is documented as part of the warning. This will be fixed in a future version of the rustc toolchain. Signed-off-by: Jamie Cunliffe <[email protected]> Reviewed-by: Vincenzo Palazzo <[email protected]>

This module is intended to contain functions related to kernel timekeeping and time. Initially, this just wraps ktime_get() and ktime_get_boottime() and returns them as core::time::Duration instances. Signed-off-by: Asahi Lina <[email protected]>

Signed-off-by: Eileen Yoon <[email protected]>

[ Upstream commit 89a906d ] Floating point instructions in userspace can crash some arm kernels built with clang/LLD 17.0.6: BUG: unsupported FP instruction in kernel mode FPEXC == 0xc0000780 Internal error: Oops - undefined instruction: 0 [#1] ARM CPU: 0 PID: 196 Comm: vfp-reproducer Not tainted 6.10.0 #1 Hardware name: BCM2835 PC is at vfp_support_entry+0xc8/0x2cc LR is at do_undefinstr+0xa8/0x250 pc : [<c0101d50>] lr : [<c010a80c>] psr: a0000013 sp : dc8d1f68 ip : 60000013 fp : bedea19c r10: ec532b17 r9 : 00000010 r8 : 0044766c r7 : c0000780 r6 : ec532b17 r5 : c1c13800 r4 : dc8d1fb0 r3 : c10072c4 r2 : c0101c88 r1 : ec532b17 r0 : 0044766c Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 00c5387d Table: 0251c008 DAC: 00000051 Register r0 information: non-paged memory Register r1 information: vmalloc memory Register r2 information: non-slab/vmalloc memory Register r3 information: non-slab/vmalloc memory Register r4 information: 2-page vmalloc region Register r5 information: slab kmalloc-cg-2k Register r6 information: vmalloc memory Register r7 information: non-slab/vmalloc memory Register r8 information: non-paged memory Register r9 information: zero-size pointer Register r10 information: vmalloc memory Register r11 information: non-paged memory Register r12 information: non-paged memory Process vfp-reproducer (pid: 196, stack limit = 0x61aaaf8b) Stack: (0xdc8d1f68 to 0xdc8d2000) 1f60: 0000081f b6f69300 0000000f c10073f4 c10072c4 dc8d1fb0 1f80: ec532b17 0c532b17 0044766c b6f9ccd8 00000000 c010a80c 00447670 60000010 1fa0: ffffffff c1c13800 00c5387d c0100f10 b6f68af8 00448fc0 00000000 bedea188 1fc0: bedea314 00000001 00448ebc b6f9d000 00447608 b6f9ccd8 00000000 bedea19c 1fe0: bede9198 bedea188 b6e1061c 0044766c 60000010 ffffffff 00000000 00000000 Call trace: [<c0101d50>] (vfp_support_entry) from [<c010a80c>] (do_undefinstr+0xa8/0x250) [<c010a80c>] (do_undefinstr) from [<c0100f10>] (__und_usr+0x70/0x80) Exception stack(0xdc8d1fb0 to 0xdc8d1ff8) 1fa0: b6f68af8 00448fc0 00000000 bedea188 1fc0: bedea314 00000001 00448ebc b6f9d000 00447608 b6f9ccd8 00000000 bedea19c 1fe0: bede9198 bedea188 b6e1061c 0044766c 60000010 ffffffff Code: 0a000061 e3877202 e594003c e3a09010 (eef16a10) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Fatal exception in interrupt ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- This is a minimal userspace reproducer on a Raspberry Pi Zero W: #include <stdio.h> #include <math.h> int main(void) { double v = 1.0; printf("%fn", NAN + *(volatile double *)&v); return 0; } Another way to consistently trigger the oops is: calvin@raspberry-pi-zero-w ~$ python -c "import json" The bug reproduces only when the kernel is built with DYNAMIC_DEBUG=n, because the pr_debug() calls act as barriers even when not activated. This is the output from the same kernel source built with the same compiler and DYNAMIC_DEBUG=y, where the userspace reproducer works as expected: VFP: bounce: trigger ec532b17 fpexc c0000780 VFP: emulate: INST=0xee377b06 SCR=0x00000000 VFP: bounce: trigger eef1fa10 fpexc c0000780 VFP: emulate: INST=0xeeb40b40 SCR=0x00000000 VFP: raising exceptions 30000000 calvin@raspberry-pi-zero-w ~$ ./vfp-reproducer nan Crudely grepping for vmsr/vmrs instructions in the otherwise nearly idential text for vfp_support_entry() makes the problem obvious: vmlinux.llvm.good [0xc0101cb8] <+48>: vmrs r7, fpexc vmlinux.llvm.good [0xc0101cd8] <+80>: vmsr fpexc, r0 vmlinux.llvm.good [0xc0101d20] <+152>: vmsr fpexc, r7 vmlinux.llvm.good [0xc0101d38] <+176>: vmrs r4, fpexc vmlinux.llvm.good [0xc0101d6c] <+228>: vmrs r0, fpscr vmlinux.llvm.good [0xc0101dc4] <+316>: vmsr fpexc, r0 vmlinux.llvm.good [0xc0101dc8] <+320>: vmrs r0, fpsid vmlinux.llvm.good [0xc0101dcc] <+324>: vmrs r6, fpscr vmlinux.llvm.good [0xc0101e10] <+392>: vmrs r10, fpinst vmlinux.llvm.good [0xc0101eb8] <+560>: vmrs r10, fpinst2 vmlinux.llvm.bad [0xc0101cb8] <+48>: vmrs r7, fpexc vmlinux.llvm.bad [0xc0101cd8] <+80>: vmsr fpexc, r0 vmlinux.llvm.bad [0xc0101d20] <+152>: vmsr fpexc, r7 vmlinux.llvm.bad [0xc0101d30] <+168>: vmrs r0, fpscr vmlinux.llvm.bad [0xc0101d50] <+200>: vmrs r6, fpscr <== BOOM! vmlinux.llvm.bad [0xc0101d6c] <+228>: vmsr fpexc, r0 vmlinux.llvm.bad [0xc0101d70] <+232>: vmrs r0, fpsid vmlinux.llvm.bad [0xc0101da4] <+284>: vmrs r10, fpinst vmlinux.llvm.bad [0xc0101df8] <+368>: vmrs r4, fpexc vmlinux.llvm.bad [0xc0101e5c] <+468>: vmrs r10, fpinst2 I think LLVM's reordering is valid as the code is currently written: the compiler doesn't know the instructions have side effects in hardware. Fix by using "asm volatile" in fmxr() and fmrx(), so they cannot be reordered with respect to each other. The original compiler now produces working kernels on my hardware with DYNAMIC_DEBUG=n. This is the relevant piece of the diff of the vfp_support_entry() text, from the original oopsing kernel to a working kernel with this patch: vmrs r0, fpscr tst r0, #4096 bne 0xc0101d48 tst r0, #458752 beq 0xc0101ecc orr r7, r7, #536870912 ldr r0, [r4, #0x3c] mov r9, #16 -vmrs r6, fpscr orr r9, r9, #251658240 add r0, r0, #4 str r0, [r4, #0x3c] mvn r0, #159 sub r0, r0, #-1207959552 and r0, r7, r0 vmsr fpexc, r0 vmrs r0, fpsid +vmrs r6, fpscr and r0, r0, #983040 cmp r0, #65536 bne 0xc0101d88 Fixes: 4708fb0 ("ARM: vfp: Reimplement VFP exception entry in C code") Signed-off-by: Calvin Owens <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

Benno Lossin and others added 30 commits May 30, 2023 20:27

arm64: rust: Enable PAC support for Rust.

f9caa83

Enable the PAC ret and BTI options in the Rust build flags to match the options that are used when building C. Signed-off-by: Jamie Cunliffe <[email protected]> Reviewed-by: Vincenzo Palazzo <[email protected]>

marcan and others added 17 commits June 17, 2023 22:20

Merge branch 'refs/heads/bits/060-spi' into asahi-wip

c422d00

Merge branch 'refs/heads/bits/070-audio' into asahi-wip

d0a0d24

Merge branch 'refs/heads/bits/080-wifi' into asahi-wip

c25de5b

Merge branch 'refs/heads/bits/090-spi-hid' into asahi-wip

a5aae36

Merge branch 'refs/heads/bits/110-smc' into asahi-wip

68e542e

Merge branch 'refs/heads/bits/120-spmi' into asahi-wip

ed35d4f

Merge branch 'refs/heads/bits/130-cpufreq' into asahi-wip

89798f4

Merge branch 'refs/heads/bits/140-pci' into asahi-wip

0a69939

Merge branch 'refs/heads/bits/150-xhci-firmware' into asahi-wip

173d792

Merge branch 'refs/heads/bits/160-fpwm' into asahi-wip

09b1186

Merge branch 'refs/heads/bits/170-atcphy' into asahi-wip

7d0e807

Merge branch 'refs/heads/bits/200-dcp' into asahi-wip

fd609be

Merge branch 'refs/heads/bits/210-gpu' into asahi-wip

70aab69

Merge branch 'refs/heads/bits/220-tso' into asahi-wip

b5c05cb

arm64: dts: apple: t8103: add ANE bindings

6027c18

Signed-off-by: Eileen Yoon <[email protected]>

arm64: dts: apple: t600x: add ANE bindings

297491e

Signed-off-by: Eileen Yoon <[email protected]>

accel: ane: add driver for the Apple Neural Engine

26d8a0f

Signed-off-by: Eileen Yoon <[email protected]>

XY-cpp mentioned this pull request Jan 9, 2024

When will NPU be supported? #255

Open

marcan force-pushed the asahi branch from b5c05cb to bd0a1a7 Compare January 12, 2024 01:40

jannau force-pushed the asahi branch from bd0a1a7 to 752b1ed Compare May 28, 2024 20:39

jannau force-pushed the asahi branch from 752b1ed to 77dd102 Compare June 22, 2024 08:20

jannau force-pushed the asahi branch from 77dd102 to ceb4e0a Compare July 15, 2024 20:03

jannau force-pushed the asahi branch 2 times, most recently from 539cbbc to 83a2784 Compare July 29, 2024 11:01

jannau force-pushed the asahi branch from e93339b to ccc12d2 Compare August 19, 2024 18:41

jannau force-pushed the asahi branch from ccc12d2 to 3e83546 Compare September 10, 2024 21:37

jannau force-pushed the asahi branch from 3416d14 to de1c5a8 Compare September 27, 2024 06:25

jannau force-pushed the asahi branch 2 times, most recently from 7c16dea to 1543478 Compare November 17, 2024 17:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ANE support #159

ANE support #159

eiln commented Jun 21, 2023

ANE support #159

Are you sure you want to change the base?

ANE support #159

Conversation

eiln commented Jun 21, 2023